UNCLASSIFIED 




DEFENSE DOCUMENTATION CENTER

FOR

SCIENTIFIC AND TECHNICAL INFORMATION

CAMERON STATION, ALEXANDRIA, VIRGINIA


NOTICE: When government or other drawings, specifications
or other data are used for any purpose
other than in connection with a definitely related
government procurement operation, the U. S.
Government thereby incurs no responsibility, nor any
obligation whatsoever; and the fact that the Government
may have formulated, furnished, or in any way
supplied the said drawings, specifications, or other
data is not to be regarded by implication or otherwise
as in any manner licensing the holder or any
other person or corporation, or conveying any rights
or permission to manufacture, use or sell any
patented invention that may in any way be related
thereto.


Cataloged by DDC as AD No. 410766


SP-1262 


Is  Relevance  an  Adequate 
Criterion  in  Retrieval 
System  Evaluation? 


Lauren  B.  Doyle 


July  1,  1963 

SYSTEM DEVELOPMENT CORPORATION, SANTA MONICA, CALIFORNIA




ABSTRACT 


It  is  argued  that  the  use  of  "relevance  to  a  search 
request"  as  a  criterion  of  what  a  system  retrieves 
is,  in  effect,  a  suboptimization  on  the  machine  side 
of  the  man-machine  interface,  and  that  the  searcher 
needs  an  efficient  exploratory  system  rather  than  a 
request-implementing system.




IS RELEVANCE AN ADEQUATE CRITERION IN RETRIEVAL SYSTEM EVALUATION?

Many  advances  in  science  have  come  as  a  result  of  questioning  concepts  and 
assumptions  which  have  previously  been  taken  for  granted.  Copernicus, 
Darwin,  and  Heisenberg  are  noted  as  much  for  the  ideas  they  swept  away 
as  for  the  new  ideas  they  brought  forth.  Today  we  are  quite  comfortable 
with  the  thought  that  the  earth  is  not  the  center  of  the  universe:  but 
this notion died hard in the century of Copernicus. And just as the 14th
century  philosophers  constructed  epicycles  to  maintain  the  integrity  of 
the  earth-centered  planetary  system,  so,  in  general,  will  people  perform 
near-unreasonable  mental  gyrations  to  defend  a  conceptual  habit  which  new 
knowledge  and  events  are  making  untenable. 

The concept of "relevance" has gained in importance in recent times along
with the trend toward tighter evaluation of retrieval systems. In evaluation
one has to compare system performance to some "ideal" or other kind
of standard, and in the case of document retrieval the ideal of performance
has sometimes been put: "To retrieve all and only those documents the
searcher would regard as relevant to his need if he could personally inspect
every document in the library." To duplicate this ideal in the practice
of evaluation, it has been customary to assemble judges to inspect documents
and concur among themselves as to what documents the system should
retrieve in response to a given request.

But this procedure contains an enormous hidden flaw, implicit in the foregoing
phrase "in response to a given request," which may cause many current
system evaluations to be looked upon, in the course of history, as comparable
to the epicycles of the 14th century. Other casualties may be relevance,
scales of relevance, and most definitely relevance numbers. The flaw is
that there may be a great difference between relevance to a given request
statement and relevance to a person's real information need. It is a
hidden flaw because in a real information search it is the request statement
(the outward expression of the information need) rather than the need
itself which comes to the surface.

If the search request statement is usually only an approximate statement
of the searcher's need, then the subset of documents pronounced by the
judges as relevant to the request would usually be different from that which
the searcher himself would choose "if he could personally inspect every
document." Taylor (1) is one of several people who have recently pointed
out the distinction between relevance to a need and relevance to a request.

His  portrayal  of  what  the  situation  might  be  like  is  interesting:  "Several 
inquirers may, for example, ask the same question -- or what is assumed to
be the same question -- and each receives, as answer, the same set of messages.
However,  on  scanning,  each  person  picks  out  a  different  subset  of  the  total 
package... as  relevant  to  his  question.  Although  their  verbally  stated 
questions  were  the  same,  it  is  obvious  that  each  inquirer  must  have  had 
a  different  need.  Yet  we  prescribe  the  same  medicine  for  each." 




If Taylor's imaginary situation were materialized, as some sort of experiment,
we would probably find the subsets greatly overlapping; surely
there  would  be  a  few  documents  that  all  the  inquirers  would  agree  were 
relevant.  Nevertheless,  we  still  ought  to  be  deeply  concerned  about  the 
non-overlapping  parts  of  such  subsets,  because  they  would  reflect  the 
general  inability  of  searchers  to  ask  the  right  questions  of  the  system. 
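
The overlap described here can be pictured with a small sketch. The inquirer
names and document identifiers are invented for illustration; they are not
data from Taylor or from any actual experiment.

```python
# Hypothetical picks: each inquirer asked the "same" question, received the
# same package of documents, and marked a different subset as relevant.
picks = {
    "inquirer_a": {"d1", "d2", "d3", "d5"},
    "inquirer_b": {"d1", "d2", "d4"},
    "inquirer_c": {"d1", "d2", "d3", "d6"},
}

def agreed_core(picks):
    """Documents that every inquirer marked as relevant."""
    return set.intersection(*picks.values())

def non_overlap(picks):
    """For each inquirer, the documents chosen outside the agreed core."""
    core = agreed_core(picks)
    return {name: chosen - core for name, chosen in picks.items()}

core = agreed_core(picks)      # the part all inquirers agree on
residue = non_overlap(picks)   # the parts that differ from person to person
```

The agreed core is what a request-implementing system can hope to be judged
on; the per-inquirer residue is the part the argument asks us to worry about.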

There  exists  a  lurking  feeling  among  retrieval  thinkers,  which  is  hard 
to  dispel  by  argument,  that  a  searcher's  initial  try  at  constructing  a 
request  is  seldom  more  than  a  crude  approximation  of  whatever  mechanism 
will  give  him  the  document  subset  that  optimally  fulfills  his  need.  Taylor, 
along  with  Bar-Hillel  (2)  and  Cheydleur  (3),  believes  that  feedback  from 
the information store (or its representation) must be provided by the retrieval
system, so that the searcher can redefine his need in a series of
iterations  (this  very  process  frequently  occurs  whenever  a  searcher  goes  to 
a  human  expert  in  pursuit  of  information).  Stiles  (4)  revealed  that  he 
too  is  experimenting  with  what  might  be  called  "man-system  interaction." 

This  is  particularly  interesting  because  Stiles  was  one  of  the  first  to 
try  out  associative  methods  of  machine  searching,  wherein  a  computer  yields 
not  only  references  which  directly  satisfy  a  searcher's  request  but  also 
references  whose  index  tags  are  strongly  associated  (statistically)  with 
the  tags  making  up  the  request.  In  other  words,  even  in  Stiles'  system, 
which  was  designed  to  use  statistics  in  getting  around  the  problem  of 
search-request  inadequacy,  there  is  still  felt  to  be  a  need  for  feedback. 
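
A minimal sketch of the associative idea follows. It is not Stiles'
association factor: a plain co-occurrence ratio (the Jaccard overlap of the
document sets carrying two tags) is swapped in for simplicity, and the tags,
document identifiers, and threshold are all hypothetical.

```python
# Hypothetical index: document id -> set of index tags.
index = {
    "d1": {"radar", "antenna"},
    "d2": {"radar", "antenna", "microwave"},
    "d3": {"antenna", "microwave"},
    "d4": {"microwave", "waveguide"},
    "d5": {"library", "catalog"},
}

def docs_with(tag):
    """Set of documents carrying the given tag."""
    return {doc for doc, tags in index.items() if tag in tags}

def association(tag_a, tag_b):
    """Jaccard overlap of the two tags' document sets (an assumed measure)."""
    a, b = docs_with(tag_a), docs_with(tag_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def associative_search(request, threshold=0.5):
    """Return (direct hits, extra hits reached only through associated tags)."""
    all_tags = set().union(*index.values())
    # Tags outside the request that are strongly associated with a request tag.
    expanded = {t for t in all_tags - request
                if any(association(t, r) >= threshold for r in request)}
    direct = {d for d, tags in index.items() if tags & request}
    extra = {d for d, tags in index.items() if tags & expanded} - direct
    return direct, extra

direct, extra = associative_search({"radar"})
```

With this toy index a request for "radar" also reaches a document tagged
only "antenna" and "microwave", because "antenna" co-occurs strongly with
"radar". Even so, the expansion is fixed by the statistics of the store, not
by anything the searcher learns along the way.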

In visualizing systems of this kind we can feel the usefulness of the concept
of relevance slipping through our fingers. We now become aware that
the "most relevant subset" is not only an individual matter for the searcher,
dependent on the time and circumstances of his searching foray, but also
that the feedback he gets is quite capable of changing his idea of what
he wants as well as changing his way of expression. An "information
need" is thus revealed to be a dynamic entity, whose times of greatest dynamism
and change may come in the very process of interacting with a retrieval
system.* How does one make use of the concept of relevance in a
situation like this?

No, I am afraid that once more, as in the days when the earth was regarded
as the center of the universe, we are succumbing to egocentricity when we
view a man as commanding a machine to do his bidding in searching the literature.
(This is the essence of the "request.") The question is not only
whether a man knows how to command, but whether command is at all an appropriate
kind of interaction between man and retrieval system. When we read
a book, do we command it? No, we either follow it or explore it. Exploratory
capability, as it turns out, is provided by traditional libraries, but
not by some of our modern machine literature searching schemes.


*  People  who  consult  permuted  title  indexes,  where  feedback  in  effect  is  frozen 
into  the  structure,  are  often  led  astray  and  find  themselves  asking:  "Now,  let's 
see,  what  was  it  I  started  out  to  look  for?" 




Also, we have adopted the mental habit of looking at things from what is
assumed to be the searcher's viewpoint, and not (for example) from the document's
viewpoint, and hence ultimately from the author's. Perhaps the author
has as much of a right to be served as the searcher, i.e., in order that his
articles should be retrieved by "relevant readers." And perhaps in that same
sense the information store is as interested in searching the searcher as the
searcher is in searching the information. After all, the information requirement
which activates the searcher to use the system may not be the only
one on his mind; it seems to me reasonable that an information user's mind
might contain a whole series of standing requests, many of which he is not
aware of at the moment of using a retrieval system. A system which functions
in an exploratory way, however, has a good chance of fulfilling some of these
requests. Cheydleur (3) speaks of "rapport" between the system and the user.
The most elemental example of a "system" which permits exploration without
barring the mechanics of requesting is permuted title indexing.

Assuming that arguments of the foregoing type can convince people that relevance
is not a measurable quantity, like weight or length, by which we can
determine the excellence of retrieval to the third decimal place, should we
then liquidate relevance as a concept, finding another to put in its place?
Probably not. In the first place, people will no more give up the concept
of relevance tomorrow morning than they would give up epicycles instantly in
response to Copernicus, nor predictability instantly in response to Heisenberg.
Relevance is a thought-crutch; with it we may think inaccurately about the
retrieval problem, but without it (or something better) we couldn't think at
all.

The concept of relevance does more good than ill, unless we take it too
literally; it allows us to suboptimize on the machine side of the man-machine
interface, which can be a productive thing to do as long as one remains aware
that he is suboptimizing. But in the long run, for the best satisfaction of
the searcher, as well as for the satisfaction of all concerned, we must also
study the human side of the interface, and study especially both sides in
interaction. When the need for exploratory capability in a retrieval system
is then acknowledged, what concept will supplant relevance, or at least
supplement it? I would suggest "sharpness of separation of the exploratory
regions in which the searcher finds documents of interest from those in which
he does not find such documents." Note that this criterion gears itself not
just to a particular request, nor even to a particular need, but to the full
dynamic possibilities of a human using a retrieval system.
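
The suggested criterion is stated only in words. One way it might be turned
into a number is sketched below, purely as an assumption: the scoring rule,
the notion of a "region" as a countable bin, and the sample figures are all
hypothetical and are not part of the proposal itself.

```python
def sharpness(regions):
    """Weighted mean of |2p - 1| over regions, where p is the fraction of
    examined documents in a region that the searcher found of interest.
    1.0 means every region is all-interesting or all-uninteresting;
    0.0 means every region is an even mix. This rule is an assumption."""
    total = sum(examined for _, examined in regions.values())
    if total == 0:
        return 0.0
    score = 0.0
    for interesting, examined in regions.values():
        if examined:
            p = interesting / examined
            score += examined * abs(2 * p - 1)
    return score / total

# Hypothetical sessions: region -> (documents of interest, documents examined).
sharp = {"region_1": (10, 10), "region_2": (0, 10)}
blurred = {"region_1": (5, 10), "region_2": (5, 10)}
mixed = {"region_1": (8, 10), "region_2": (1, 10)}
```

Under this rule the "sharp" session scores 1.0 and the "blurred" one 0.0.
The score depends on where the searcher actually wandered and what he found
there, not on any request statement.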

"Relevance"  will  serve  its  purpose,  but  will  decline  as  the  realization  slowly 
comes  that  an  individual's  information  need  is  so  complex  that  it  cannot  be 
accurately stated in a simple request. The fact that people do request information
in simple terms is a reflection of the inadequacy of both people and
systems, and not a reflection of the true structure of the need. The gradually
increasing awareness of a human's incapability of stating his true need in
simple  form  will  tend  to  pull  the  rug  out  from  under  many  IR  system  evaluation 
studies  which  will  have  been  done  in  the  meanwhile. 




References 


1. Taylor, Robert S. 1962. The Process of Asking Questions. American
   Documentation, 13(4): 391-396.

2. Bar-Hillel, Y. 1960. Some Theoretical Aspects of the Mechanization
   of Literature Searching. Technical Report No. 3. April. U. S. Office
   of Naval Research, Washington, D. C.

3. Cheydleur, B. F. 1961. Information Retrieval -- 1966. Datamation,
   7(10): 21-25.

4. Stiles, H. E. 1963. Informal communication at a meeting on associative
   indexing held in January at the National Science Foundation.

   Also:

   Stiles, H. E. 1961. The Association Factor in Information Retrieval.
   Journal of the Association for Computing Machinery, 8(2): 271-279.




System Development Corporation,
Santa Monica, California
IS RELEVANCE AN ADEQUATE CRITERION IN
RETRIEVAL SYSTEM EVALUATION?

Scientific rept., SP-1262, by L. B. Doyle.
1 July 1963, 6p., 5 refs.

Unclassified report

DESCRIPTORS: Information Retrieval.

Argues that the use of "relevance to
a search request" as a criterion of
what a system retrieves is, in effect,
a suboptimization on the machine side
of the man-machine interface, and that
the searcher needs an efficient
exploratory system rather than a
request-implementing system.




